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■ Abstract 
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I We present an object-oriented geophysical and astrophysical spectral-element adap- 

tive refinement (GASpAR) code for application to turbulent flows. Like most spectral- 
element codes, GASpAR combines finite-element efficiency with spectral- method 
accuracy. It is also designed to be flexible enough for a range of geophysics and 
astrophysics applications where turbulence or other complex multiscale problems 
arise. For extensibility and flexibilty the code is designed in an object-oriented 
manner. The computational core is based on spectral-element operators, which are 
represented as objects. The formalism accommodates both conforming and non- 
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conforming elements and their associated data structures for handling interelement 
communications in a parallel environment. Many aspects of this code are a synthesis 
of existing methods; however, wc focus on a new formulation of dynamic adaptive 
refinement (DARe) of nonconforming /i-type. This paper presents the code and 
its algorithms; we do not consider parallel efficiency metrics or performance. As a 
demonstration of the code we offer several two-dimensional test cases that we pro- 
pose as standard test problems for comparable DARe codes. The suitability of these 
test problems for turbulent flow simulation is considered. 

Key words: spectral element, numerical simulation, adaptive mesh. AMR 



1 Introduction 

Accurate and efficient simulation of strongly turbulent flows is a preva- 
lent challenge in many atmospheric, oceanic, and astrophysical applications. 
New numerical methods are needed to investigate such flows in the parameter 
regimes that interest the geophysics communities. Turbulence is one of the last 
unsolved classical physics problems, and such flows today form the focus of 
numerous investigations. They are linked to many issues in the geosciences, for 
example , in meteorology, oceanography, climatology, ecology, solar-terrestrial 
interactions, and solar fusion, as well as dynamo effects, specifically, magnetic- 
field generation in cosmic bodies by turbulent motions. Nonlinearities prevail 
in turbulent flows when the Reynolds number is large. The number of 
degrees of freedom (d.o.f.) increases as Re^^^ for Re 3> 1 in the Kolmogorov 
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(1941) framework. For aeronautic flows often Re > 10^, but for geophysical 
flows often Re » 10^; [29, 10] for this and other reasons the ability to probe 
large Re, and to examine in detail the large-scale behavior of turbulent flows 
depends critically on the numerical ability to resolve a large number of spatial 
and temporal scales. 

Theory demands that computations of nonconvective turbulent flows re- 
flect a clear separation between the energy-containing and dissipative scale 
ranges. Uniform-grid convergence studies on 3D compressible-flow simula- 
tions show that in order to achieve the desired scale separation, uniform grids 
must contain at least 2048^ cells [35] . Today such computations can barely be 
accomplished. A pseudo-spectral Navier-Stokes code on a grid of 4096^ uni- 
formly spaced points has been run on the Earth Simulator [18], but the Taylor 
Reynolds number (oc is still no more than Ri 700, very far from what 

is required for most geophysical flows. Similar scale separation is required in 
computations of convective-turbulent flows. However, if the flow's significant 
structures are indeed sparse, so that their dynamics can be followed accurately 
even if they are embedded in random noise, then dynamic adaptivity seems 
to offer a means for achieving otherwise unattainable large a/Rb values. 

We have developed a dynamic geophysical and astrophysical spectral-element 
adaptive refinement (GASpAR) code for simulating and studying turbulent 
phenomena. Several properties of spectral-element methods (SEMs) make 
them desirable for direct numerical simulation (DNS) or large-eddy simula- 
tion (LES) of geophysical turbulence. Perhaps most significant is the fact 
that SEM are inherently minimally diffusive and dispersive. This property 
is clearly important when trying to simulate high Re number (low viscosity) 
flows that characterize turbulent behavior. Also, because SEMs use a finite el- 



3 



ement as a fundamental description of the geometry [31], they can be used in 
very efficient high-resolution turbulence studies in domains with complicated 
boundaries. Such discretizations also enable SEMs to be naturally paralleliz- 
able ([e.g., 15]), an important feature when simulating flows at high with 
many d.o.f. involving multiple spatial and temporal scales. Equally important, 
spectral element methods not only enjoy spectral convergence properties when 
the solution is smooth but are also effective when the solution is not smooth 
[9] (see (A.7) below). 

In the case of SEMs, conforming adaptive methods (where entire element 
edges geometrically coincide, as in Fig. 1) are gradually being replaced by 
nonconforming adaptive methods. One reason is that mesh generation for 
conforming methods is complicated when attempting to resolve local flow fea- 
tures [32]. Another reason is that adaptive conforming meshes can lead to 
high-aspect-ratio elements that can cause difficulties for a linear solver [12]. 
Moreover, the fact that nonconforming elements can better localize mesh re- 
finement implies that the computational cost among all elements can be re- 
duced [23, 30]; with conforming adaption, however, local refinement regions 
may extend out to where refinement is not dictated by local features of interest 
within the solution. 

Nonconforming element discretizations can be geometrically and/or func- 
tionally nonconforming. In the former case (Fig. 2), neighboring elements have 
boundaries that do not coincide; in the latter, the polynomial expansion degree 
p in neighboring elements can differ. Several SEM researchers have adopted 
the discretization method that simultaneously alters element size h and con- 
figuration (/i- refinement) and the polynomial degree p in neighboring elements 
(p-refinement), providing for a so-called /i-p-refinement strategy. The mortar 
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Fig. 1. Schematic of a conforming degree p = 2 mesh showing the mapping of global 
(i.e., unique) d.o.f. in the problem domain ID (left) to local (i.e., redundant) d.o.f. in 
the elements E^. (right). Edge subscripts give element index k and edge index from 
s = (south edge) counterclockwise to s = 3. The element Ei is bounded at the 
east by the edge dKi^i and E2 is bounded at the west by the edge 9E2,3 = dKi^i. 
The interface matching condition occurs by simple assignment leading to a Boolean 
assembly matrix Ac. 

element method (MEM) [1, 4, 25] is a nonconforming discretization method 
that uses a variational formulation to minimize the Lebesgue L2 norm of the 
discontinuous jump across nonconforming spectral-element boundaries. In a 
number of recent applications, this method has been used in unstructured 
(static, nonadaptive) simulations of turbulence [16] and for ocean dynamics 
[19, 24]. This method has been shown to produce optimal convergence in the 
incompressible Stokes equations solution [3], and it has been demonstrated 
experimentally to produce excellent results when used as a basis for adaptive 
mesh refinement [28]. 



Nonconforming h-p adaptive methods using MEM have been developed for 
studying turbulent phenomena [17], ocean modeling [24], flame front deforma- 
tion [11], electromagnetic scattering [22], wave propagation [6], seismology [7] 
and other topics. However, MEM for p-type refinement has been cited as caus- 
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Fig. 2. As in Fig. 1 but for a geometrically nonconforming (but functionally con- 
forming — each element has p = 2) mesh. Here E2 and E3 are bounded at the west 
by "child" edges 5E2,3 and ^£3^3, and Ei is bounded at the east by the "parent" 
edge (9Ei^i = 9E2,3 |J ^^3,3- The mapping occurs by means of an interpolation of 
global d.o.f. from the function space associated with the parent edge onto the union 
of those associated with the child edges, which contains the function space of the 
parent. 



ing instabilities in flow calculations [32]; for this reason the interpolation-based 
method was developed. Also, in most flows of interest to us, it is the different 
scales' interaction that determines not only the structures that form but also 
their statistics and time evolution. This suggests that reasonably high-order 
approximations are required in each element during much of the evolution. 
Thus, in the present work, using the fact that the numerical solution order is 
relatively high, we restrict ourselves to a nonconforming /i-refinement strategy 
only and use an interpolation- based scheme to maintain continuity between 
nonconforming elements [12, 23]. (Throughout the remainder of this paper 
"nonconforming" will refer to geometrically nonconforming elements, keeping 
the expansion degree p fixed in all elements.) Researchers have recently used 
this nonconforming treatment as a basis for performing DARe [34]; however. 
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to the best of our knowledge, our implementation of this interpolation scheme 
in the fully dynamic adapt ivity context is unique. 

Our purpose in this paper is to describe GASpARand, in particular, the 
procedures used in the DARe technique. We first describe (Section 2.1) SEM 
discretization on a particular class of problems and introduce many of the 
required formulas, operators, and so forth. We discuss (Section 2.3.1) the type 
of nonconforming discretization allowed and the way in which continuity is 
maintained between elements sharing nonconforming interfaces. We explain 
the linear-solver details in Section 2.4. In Section 2.5 we consider how noncon- 
forming edges are identified and how neighboring elements are found, we and 
present the rules for refinement and coarsening. In Section 2.5.2 we present 
the gather/scatter matrix that facilitates communication of edge data. A pri- 
ori error estimators are discussed in Section 2.5.3 as a means to provide DARe 
criteria. Then, in Section 3 we consider two test-problem classes: solutions of 
the heat equation and linear advection-diffusion equation (Section 3.2), which 
demonstrate feature tracking of smooth and isolated features governed by 
dynamics of linear systems; and the Burgers equation, that tests the ability 
of DARe to capture and track well-defined and reasonably sharp structures 
(Section 3.3) that arise from nonlinear dynamics. In Section 4 we offer some 
conclusions on our current work, as well as some comments on the application 
of GASpAR to geophysical turbulence simulations. 

2 Methodology 

Because we wish to focus on the DARe methodology of GASpAR, we con- 
centrate on the simplest multidimensional nonlinear equation that encom- 
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passes many of the difficulties in simulating fluid turbulence. Thus in this 
section we present the discrete form for the 2D Burgers equation and its vari- 
ants. In turn we present the spatial discretization using a variational formula- 
tion and the spectral-element operators that arise. We then present the time 
discretizations. 

2. 1 Discretization of the dynamics 

The equations considered in this work derive from the d-dimensional advection- 
diffusion equation for velocity u{x, t): 

dtu + c-Vu = uV^u, (1) 

where u oc Re~^ is the kinematic viscosity. This is to be solved in a spatiotem- 
poral domain {x, t) e D x ]0, tf] subject to the boundary and initial conditions 

u{x, t) = b{x, t) for (f , t) edBx ]0, ti] , (2) 
u{x,0)=Ui{x) for feD. (3) 

In this work, c may be u, so that (1) is the Burgers equation, or c = c{t), a 
prescribed uniform linear advection velocity. 

2.1.1 Variational approach to spatial discretization 
Define the function space 

Ug= |m = J^^M^"^ l^^e tf(D)V/x & M = 6on5©|, 
where denotes the Cartesian unit vectors and 

tf(D) = {/ |/eL2(D) & a,./eL2(D) V/x}. 
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Then the discretization of (1) is based on the following variational form: Find 
the trial function u{-,t) e Ug such that for any test function v 

{v, dtu) + {v, Cu) = -u (v/, Vu) , (4) 

where C = c • V is the advcction operator and the inner product is (A. 23). 
Note that the treatment of (3) will not be made explicit but may be easily 
inferred from the general discussion. 

We assume that D can be partitioned as in (A. 15). (See the appendix for the 
complete mathematical details.) To discretize (4), we adopt a Gauss-Lobatto- 
Legendre (GLL) basis (A. 20), that is, expand and using (A. 18). Inserting 
these expansions into (4), we arrive at the semi-discrete ODE system problem: 
Find the numerical solution u,^{-,t) = 4>^u{t) £ Vh,p^h ^^^^ that for all v — 

v^M— + v^Cu = -uv^Lu (5) 

Ci.t' 

collocated at KN^ mapped Lagrange node points (A. 17), where M = diag;^ M^., 
C = diag^ Cfe, and L = diag^, are the unassembled block-diagonal mass ma- 
trix, linear or nonlinear advection matrix [cf. 9, Ch. 6], and diffusion matrix, 
respectively. The respective dN^ x dN^ matrix blocks locally indexed to ele- 
ment Kk are 

M^^, ^ ,)^^ = S,j,5^'^'w,,„ (6) 

including the quadrature weights (A. 22), 

where ^jk(i) = clxf^kjt), and 

/"GJ 
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To compute V0j;jk, one differentiates (A. 4) and uses (A. 19). The weak diffusion 
matrix for deformed quadrilaterals or deformed cubes (nonlinear maps 'dk) can 
also be constructed (e.g., [9]). While these elements are supported in GASpAR, 
they are not necessary for the present discussion. 

Restricting ourselves for the moment to the kth element E^, we see that 
(5) must hold for the restriction vIe^ = (p^Vk of v to any E^, so that the 

— * 

unassembled ODE for ■Unli^, = (p'kUk is 

Mjfc-^ + CfeWfc = -iy\-kUk. (8) 

Strictly speaking, (8) is true only after being assembled in the manner dis- 
cussed in Section 2.3.2. Continuity of Un across all elements is a sufficient 
condition for maintaining u'^ G E[^(D). We allow two element configuration 
types, conforming and nonconforming, as illustrated in Figs. 1 and 2, respec- 
tively. Conforming discretizations enforce continuity simply by assigning the 
same Un values to the coinciding node points Xj-^k = ^j',fe' along element edges 
dKk^s — dE.k>^s'- In the nonconforming case ^E^^^ C dE.k\s' a-nd functional 
continuity cannot be accomplished by simple assignment because most node 
points are not coinciding; thus, we use an interpolation-based scheme to en- 
force continuity along a nonconforming interface. This scheme is the subject 
of Section 2.3.1. 

2.1.2 Time discretization 

While there are many time discretization schemes (e.g., [5]), we restrict our- 
selves to semi-implicit multistep methods. In all these methods the diffusion 
is solved fully implicitly while the time-derivative is approximated using a 

backward-difference formula (BDF) of order Mbdf (see [9, 21]) and the advec- 
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tion term is approximated by using an explicit extrapolation-based method 
(Ext) of order M^^t [20]. Then, to integrate (8) from time f""^ to time i", one 
has 

n— 1 n— 1 

»kK- E «Mr<- E Crcr<, (9) 

m=n— Mbjf m=n— Mext 

where 

= + z/L^^ (10) 

is a discrete spectral-element Helmholtz operator. Although the matrices 
and Mk in (8) were i-independent, they are time-indexed in (9) because DARe 
will, in general, reconfigure the partition (A. 15) over time. For this reason the 
coefficients are computed for each as in the traditional schemes cited 
except that the timestcp At may vary with m as the smallest spectral-element 
diameter h = miufe/i^ (A. 16) changes. As discussed below, continuity is 
maintained during the linear solve of (9) assembled over k for the solution 
tt". Because the matrix H*^ = diag^ is symmetric positive-definite (SPD), 
provided that is restricted to U^, the solution of the assembled (9) is 
obtained by using a preconditioned conjugate gradient (PCG, see [33, 38]) 
algorithm to invert H" at the time step to t". In the presence of nonconforming 
elements, care must be exercised to ensure that the search directions in the 
PCG algorithm are in Ug. We consider appropriate modifications to PCG in 
Section 2.4. 

2.2 Implications for code design 

The fully discretized advection-diffusion equation (9) suggests several issues 
that we have taken into consideration when designing the code. First, all 
geometric (mesh) information is separated from all other code objects, since 
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element type information can be encoded easily into the objects that require 
this distinction. Second, the solution data must be available at multiple time 
levels, so it is reasonable to provide this information in a data structure. Thus 
we are led to the notion of element and field objects. The former contains 
all geometric information in d dimensions, including the values and tensor- 
product ordering of the Gauss-quadrature nodes (A. 17) and weights (A. 22). 
The element object also contains neighbor-list information and an encoding 
of the hierarchical element refinement level oc logi h^. of each element E^. The 

2 

field object contains the data representing the physical quantity of interest 
at all required time levels f^. 

We note that, while we concern ourselves with the advection-diffusion equa- 
tion (1) here, the code will allow any equation that is discretized by using a La- 
grangian basis on up to two meshes (important, for example, when discretizing 
the Navier-Stokes equations by using a formulation coupling the polynomial 
spaces Vp-Vp_2 [26]). The basis functions and the ID derivative matrices and 
Gauss-quadrature nodes (A. 10) and weights (A. 14) are encapsulated within 
basis classes (objects), and the SEM operators such as (6,7,10) above are con- 
structed as objects that contain pointers to the basis objects and to a local 
element object. Generally the d-dimensional SEM operators are not stored but 
are constructed from a tensor product application of the relevant ID objects. 
High-level objects encapsulate the solution of (8) or other equations, such 
as the Navier-Stokes equations, and have common interfaces that allow the 
equations to take a single time integration step; in other words, all high-level 
equation solver classes are used in the same way. All the higher-level objects 
are constructed with "smart" arrays or hnked lists of elements and fields that 
are independent of the objects that solve the equations. Hence, the classes 
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that handle DARe and enforce continuity between elements being solved. The 
high-level objects also contain linked lists of SEM operators that do depend 
on the equation being solved. 



2.3 Local application of spectral- element operators 

In general, global meshes consist of multiple-element grids. In this subsec- 
tion, we make explicit the form of the SEM operators implemented in the code 
that ensure the solution continuity across element interfaces. 

2.3.1 Continuity between nonconforming elements 

In a conforming treatment, C° continuity is maintained between elements 
by ensuring that the function values on the coinciding nodes are the same. The 
matching condition, then, consists of expressing the A^g global (unique) d.o.f. 
Wg in terms of the local (redundant) d.o.f. as c^A^-vectors Ufc, A; e {1, • • -K}. 
Generally Ag < KdN^. This mapping can be accomplished by using a KdN^ x 
Ag Boolean assembly matrix Ac, such that 

u — AcUg. (11) 

For example, if our mesh partition is that in Fig. 1, we can easily see that 
(11) takes the following explicit form (suppressing zero- valued and ii > \ 
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blocks) : 



'10 
10 
1 




1 
1 
1 



u = 




U8,l 



10 
10 
1 




1 







1 
1 



1 







1 
1 



V 



1 









In practice, Ac is never formed explicitly but is instead applied. The matrix 
Ac, which assigns global d.o.f. to the local d.o.f., is also called a scatter matrix. 
Its transpose A^ performs the associated gather operation. 

In our nonconforming treatment, we use an interpolation-based method for 
establishing the interface matching conditions between neighboring (noncon- 
forming) elements. The underlying function space (Section 2.1) remains 
the same, unlike in MEM. Consider the nonconforming mesh configuration 
depicted in Fig. 2. For the moment denote the global nodes, those nodes re- 
siding on the east parent edge (?Ei_i, by Xg^j, i e {2, 5, 8}, and denote the nodes 
on the west child edges, 9D2,3 and ^Dg^g by f^, j e {9, 12, 15, 18, 21, 24}. We 
explain elsewhere how in the weak formulation of (1) or other PDEs, the test 
function factors (j)g,i{x) arise that are continuous across the global domain D, 
and interpolate from the global node values ^ [14] . A continuous solution is 
then found in span^ 0g_i by projecting there from the space of trial functions 
(f)j{x) that interpolate from the local node values xj but that do not need to 
be globally continuous. The matrix block 4>g^i{xj) of A thus generalizes the 
Boolean scatter matrix used in the conforming-element formulation and ac- 
commodates both conforming and nonconforming elements. It is convenient 
to factor A = $Ac, where $ is the interpolation matrix from global to local 



14 



d.o.f. and Ac is locally conforming. 

For example, the explicit form of (11) for the nonconforming assembly ma- 
trix A = #Ac for the mesh illustrated in Fig. 2 is (suppressing zero-valued 
and /X > 1 blocks) 



u 



U8,l 
"0,2 



^8,2 



\US,3J 



1 













3/8 








ODD 




3/4 



1 



-1/8 





1 
1 









1 





-1/8 3/4 3/8 












1 







1 / 

1/ 



3/8 3/4 -1/8 

000 000 000 

000 000 000 

-1/8 3/4 3/8 

000 000 000 

000 000 000 




1 







1 
1 



1 





1 ; 

1 / 



Note that the A entries corresponding to the child- node rows (see Fig. 2) 
are not Boolean but that every row sum is unity. This result is to be expected 
because A must accommodate interpolation of a constant solution (e.g., Ug^i — 
1 Vi) across a nonconforming interface. 



15 



2.3.2 Global assembly 

To accommodate Dirichlet boundary conditions (2) into the solution, we 
employ a masking projection which is locally diagonal with unit entries ev- 
erywhere except corresponding to nodes on Dirichlet boundaries, where there 

— * 

are zero entries. Any field (f)^u — u E may be analyzed ets u — Uh + u^, 
where = ^FlAcUg constructs the projection Uh = (p^U]^ G Ug of u, that is, 
its homogeneous part, and Wb = w — Wh constructs G Uj, which vanishes 
at the interior nodes xy^k ^ dli of D. Inserting this analysis into (5) (noting 
v e U(j =^ i; = $nAci;g) and repeating the time discretization leading to (9), 
we arrive at the following linear equation to solve for Wh at each time step: 

v^Hu^v^f Vi;g=^ A^n$^H$nAeWg = A^n#^(/-Hub), (12) 

where we have ascribed all explicit past-time references from the time-derivative 
expansion and the advection terms in (9) to /. Equation (12) is solved by us- 
ing the PCG algorithm. While (12) shows explicitly that the matrix on the 
left is symmetric nonnegative-definite, it is not in a form easily solved on a 
parallel system. Left-multiplying (12) by $nAc and letting 

Z = ^HAcA^n*^, (13) 

we get the following local form: 

ZHuh = Z(/-Hub). (14) 

The direct stiffness summation (DSS) matrix Z is coded so that the gather 
and scatter are performed in one operation (see Section 2.5.2), which reduces 
communication overhead on parallel systems [36]. 

In addition to fl we must introduce the inverse multiplicity matrix W to 
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maintain H^(D) continuity (see Section 2.4). This matrix is diagonal, com- 
puted by initializing a collocated vector g^j. = 1 VJ, k, /i, setting child face 
nodes to 0, performing g <— $AcA^$^gi, then setting 



1 9j,ki if ^j,k — ^]',k' coincides with a global node, 
0, otherwise. 
For example, corresponding to Figs. 1 and 2 the diagonals of W are 



(11^11^11^-11-11-1 11 

^1 2' ' ' 2' ' ' 2' 2' ' ' 2' ' ' 2' ' / 



and (1, 1, 1, 1, 1, 1, 1, 1, 1,0,1, 1,0, l,l,0,i,i,0,i,i,0, 1,1,0, 1,1), 



respectively. 



2.4 Linear solver 



The modifications to the PCG algorithm required to solve (14) in the non- 
conforming case stem from the requirement that the iteration residuals r and 
the search directions w correspond to functions f = and w = (f)^w be- 
longing to H^(D)'^. The CG algorithm searches the global d.o.f. space for the 
solution to the linear equation. So that we may continue to use the local matrix 
forms, however, we must also mask off all Dirichlet nodes (if any exist), which 
are not solved for. The assembly matrix masks off these nodes in such a way 
that the new search direction w e EI^(D)'^. Additionally, in all cases in the CG 
iteration where a quantity g must remain in HI^(D)'^, we explicitly "smooth" 
it by gf Sgf, using an smoothing matrix S = ^flAcA^W. The result 
of the operation is a quantity that is interpolated properly to the child edges 
and that is expressed precisely once (unlike in the DSS case) at multiple local 
nodes that represent the same spatial location. A similar smoothing matrix 
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Wh = 

r = Z(/ - HSwb) 



W7 = 



// initialize homogeneous term 

// initialize residual 

// initialize search vector 

// initialize parameter 



Loop until convergence: 

e = SPV / 

Po = Pi, Pi = T^We / 

w ^ e + wpi/pQ I 

r' = ZHio / 

a = pi/w'^Wr' I 

Uh 1th + (^'^ I 

r ^ r — ar' i 



1 1 error estimate 

/ / update parameters 

/ / increment search vector 

/ / image of w 

j j component of increment 

/ / increment along w 

II increment residual 



End Loop 



u = Sf(wh + Wb)- 



Fig. 3. PCG algorithm 



Sf = $AcA^W, where the Dirichlet mask is not inserted, is used to smooth the 
final inhomogeneous solution. Note that it is critical that the inhomogeneous 
boundary term Mb G H^(D)'^ in (14); thus, the smoothing matrix S is applied 
to Ub before the Helmholtz operator is. However, the nonsmoothed boundary 
term must be added after the convergence loop in order to complete the solu- 
tion. Note also that the final smoothing operation follows the addition of the 
boundary condition and therefore cannot be masked; hence the distinction of 
the final Sf matrix. 

2.4-1 Modified preconditioned conjugate-gradient algorithm 

With these considerations we present in Table 3 the PCG algorithm for the 
assembled local problem (14) for conforming elements [see 9, Sec. 4.5.4] mod- 
ified for nonconforming elements. Preconditioning is handled by the matrix 
P~\ In general, the preconditioned quantity must be smoothed. 
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2.4-2 Preconditioners 

Much investigation remains to determine the optimal preconditioners 
for the equation solvers supported in GASpAR. However, the preconditioning 
strategy is not the present work's focus. Nevertheless, for the purpose of tur- 
bulence study we, make some general comments to guide the development of 
optimal preconditioners. If we consider the expression (10) for the spectral- 
element Helmholtz operator, we note that u oc Re~^ is very small. Also, because 
we always use relatively high spatial degree, the timestep At ocl/ is very 
small. Hence, we see that is strongly diagonally dominant. Thus, approx- 
imating with its diagonal entries alone is a reasonably good choice for a 
preconditioner, and this is used in the present work. 

2.5 Adaptive mesh formulation 

As mentioned in Section 2.1.1, the global domain D is initially covered 
(A. 15) by a set of disjoint (nonoverlapping) elements E^. Each of these initial 
elements becomes a tree root clement, which is identified by a unique root key 
kj. e {1, 2, 3, • • • } for that tree. At each level £ e {-^min, • • ••^max}, an element 
data structure provides both its own key k and its root key k^. For any level i, 
the range of 2'^^ valid element keys will be A; e 2'^^{kj. -|- 1) — 1] because 

the refinement is isotropic (that is, it splits an element at the midpoints of 
all its edges to produce its 2^^ child elements) . Conversely, we obtain the level 
index from the element key using 

e=liog,4k/k,)\. (15) 
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In order to ensure all keys are unique, given a root the next root is k'^ — 
2'^^^{k^ + 1), and so on. 

After elements E^. are identified ( "tagged" ) for refinement or coarsening at 
level £, three steps are involved in performing DARe: (1) performing refine- 
ment by adding a new level of 2'^ child elements E2cifc, • • •E2d(fc+i)_i at level 
£ + 1 to replace each E^, or else coarsening 2'^ existing children E^, ■ ■ ■ E;j_,_2<i_i 
into a new parent 'E,\j^i2d^] (2) building data structures for all element bound- 
aries, which hold data representing global d.o.f. and accept gathers {tCu val- 
ues) or perform scatters (Aug values); and (3) determining neighbor lists for 
data exchange. Neighbor lists consist of records (structures) that each con- 
tain the computer processor id, element key A;, root key k^ and boundary id 
s e |o, • • • 2*^ — l| of each neighbor element that adjoins every interface. In 
refining or coarsening, the field values for each child (parent) elements are 
interpolated from the parent (child) fields. For simphcity, the interior of each 
element boundary is restricted to an interface between one coarse and at most 
2'^~^ refined neighbors. Thus, at most one refinement-level difference will exist 
across the interior of an interface between neighboring elements. 

In GASpAR, the data structures that represent global d.o.f. at the inter- 
element interfaces are referred to as "mortars." These structures are not to 
be confused with the mortars used in MEM; however, they serve as templates 
for that more general method. Recalling Fig. 2 as a paradigm, in general the 
mortars contain node locations and the basis set of the parent edge (in 2D, 
or face in 3D). These structures represent the same field information for the 
parent and child edges; their nodes coincide with the parent edge's nodes, and 
they interpolate global d.o.f. data to the child edges, as described above. The 
mortar data structures are determined by communicating with all neighbors 
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to determine which interfaces are nonconforming. This communication uses 
a voxel database (VDB) [16]. A voxel database consists of records containing 
geometric point locations, a component id that tells what part of the element 
Ejfc (edge dEk,s, vertex e 9^Efe_s, etc.) the point represents, an id of the element 
that contains the point, that element's root id, and some auxiliary data. Two 
VDBs are constructed: one consists of all element vertices, and one consists of 
all element edge midpoints. With these two VDBs, we are able to determine 
whether a relationship between neighbor edges is conforming and also deter- 
mine the mortar's geometrical extent. The VDB approach can also be used for 
general deformed geometries in two and three dimensions, as long as adjacent 
elements share well-defined common node points. 

The algorithm classes that carry out refinement operate only on the element 
and field hsts. The SEM solvers adjust themselves automatically to accommo- 
date the lists that are modified as a result of DARe. 

2.5.1 Refinement and coarsening rules 

The method that actually carries out the refinement and coarsening takes 
as arguments only two buffers: one containing the local indexes of the ele- 
ments to be refined, and one with indexes of elements to be coarsened. While 
the refinement criteria that identify elements for refinement or coarsening are 
described in Section 2.5.3, we point out here that before mesh refinement or 
coarsening is undertaken, the tagged elements are checked for compliance with 
several rules. For refinement, we have the following rules: 

rl. The refinement level must not exceed a specified value. 

r2. No more than one refinement level may separate neighbor elements. 
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These rules must be adhered to even in the case of interfaces that he on 
periodic boundaries. The refinement hst is modified to remove any elements 
that would violate rule rl, and rule r2 is enforced by tagging a coarse element 
for refinement if it has a refined neighbor that is in the current refinement list. 
This is most easily effected by building a global refinement list, consisting of 
the element keys of all elements tagged for refinement. Each local element's 
neighbors are then checked; if a neighbor is in the global refinement list and 
it is a nonconforming neighbor, then the local element must also be refined. 

We may not coarsen an element if under any of the following conditions: 

cl. The element is the tree root. 

c2. Any of its 2*^ — 1 siblings are not tagged for coarsening. 
c3. The element appears in a refinement list. 
c4. Refinement rule r2 would be violated. 

To enforce rule c4, we introduce the notion of a query-list, in other words, a 
global hst of records of each element key k, its parent key [k/2^\ , and its refine- 
ment level £ (15). The global list is built from local buffers of element indexes 
that are then gathered from among all processors. The following procedure is 
then used: 

1. Build a query-list (RQL) from the element indexes in the refinement list. 

2. Find the maximum and minimum refinement levels £max and £min repre- 
sented among all keys tagged for coarsening. 

3. Reorder the current local coarsen list from ^max down to ^min- 

4. Working from i = ^max down to imin, 

i. Build a query-list (CQL) from the element indexes in the current 
coarsen hst. 
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ii. For all elements in the local coarsen list at the current £, check that 
(a) any refined neighbor is in the CQL and (b) no refined neighbors 
are in the the RQL. If both conditions are met at this £, then the 
local element index is retained in the current coarsen list. Otherwise 
it is deleted. 

5. Do a final check that all elements in the local coarsen list have all their 
siblings also tagged for coarsening. This is done by building a query-list 
of all current coarsen lists and verifying that a local element's siblings 
are in the global list. Sibling elements are identified by having the same 
parent key as the local element. 

We note that in order to make the algorithm consistent, the local refinement 
lists are checked, and possibly modified before checking and modifying the 
coarsen lists. 



2.5.2 Communicating boundary data 

The mortar data structures contain all the data to be communicated be- 
tween elements during each application of the DSS or smoothing operation 
(13). GASpAR communicates element-boundary data by exchanging data be- 
tween adjacent elements, which necessitates network communication on paral- 
lel computers. This data exchange involves an initialization step and and oper- 
ation step. The initialization step establishes the required element/processor 
connectivity by performing a bin-sort of global node indexes and having each 
processor process the nodes from a given bin to determine neighbor lists. This 
method was suggested in [9, ch. 8] but to our knowledge has never been im- 
plemented. The procedure is to label the mortar-structure nodes with unique 
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indexes, generated by computing the Morton-ordered index for each geomet- 
ric node point. This index is computed by intcgralizing the physical-space 
coordinates and then interleaving the bits of each integer component to cre- 
ate a unique integer. Thus, all nodes representing the same geometric posi- 
tion will have the same index label. For P processors, a collection of bins 
B;, ie{0,---P — 1}, is generated that partitions the dynamic range (global 
maximum) of the node indexes. Each processor I partitions its list of node 
indexes into the bins, sending the contents of bin B;/ to processor /', where 
the data are combined with those from other processors and then sent back 
to the originating processor. After this step, a given processor is informed of 
which other processors share which nodes among all the mortar nodes. 

With the information gleaned from this initialization step, the operation 
step involves the communication of the data at any node point with all other 
processors that share that node. This data is extracted from the element in 
question by using the pointer indirection provided by the local-to-global map 
represented by the unique node index. The data at common vertices is summed 
for the DSS or smoothing operations and returned to the local node also by 
indirection. The algorithm provides that common vertices residing on the same 
processor are summed before being transmitted to the other processors that 
share the node, in order to reduce the amount of data being communicated. At 
the end of the operation step, the data at multiply-represented global nodes 
are identical. This gather-scatter procedure ensures that the DSS'ed data are 
available for local computation immediately after the final communication of 
data. 

One benefit of this method for performing the gather-scatter operations is 
that it allows the communication to separated from the geometry because the 
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global unique node ids are essentially unstructured lists of local data loca- 
tions. However, the method is somewhat inefficient in the current formula- 
tion of DARe in GASpAR. Neighbor lists for transmitting data may also be 
constructed by using the VDB. Since the VDBs are synchronized and thus 
represent, in a sense, global data, this same VDB can also be used to deter- 
mine a given element edge's neighbor lists. After the mortars are constructed, 
each element edge is associated with its neighbors' local ids and processor ids. 
Hence, the neighbor lists for handling element-boundary data exchange are 
determined easily from existing global data; there is no need to perform the 
initialization step in the gather-scatter method described above because the 
information is readily available from the VDB synchronization. 

2.5.3 Error estimators 

Elements are tagged for refinement or coarsening by using an a posteri- 
ori refinement criterion. One criterion, adapted from [27], uses the Legendre 
spectral information contained within the spectral-element representation to 
estimate the quadrature and truncation errors in the solution and to compute 
the rate at which the solution converges spectrally in each element E^. We 
refer to this as the spectral estimator. The discretized solution along ID lines 
in coordinate direction /i' (fixing the other coordinates) is represented as a 
Legendre expansion 

«'^(4(---,e'---)) = E«,%r'), (16) 

3=0 

where the Uj can be computed easily by using the orthogonality of Lj w.r.t. 
(A. 23). Then an approximation to the solution error along the line can be 
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expressed as 



f=-^ — 

=-est — 



\i:ij+iy (^.")^ (17) 

where the j = p term is the (over)estimate of the quadrature error and the 
j > p terms sum to the truncation error. Since we do not have Uj>p, we must 
extrapolate using the coefficients we do have. We assume Uj^p can be approx- 
imated by In I I pa InC^ — X'^j, and we determine and from a linear 
fit using the M final coefficients Up-M<j<p (16). With this continuous ap- 
proximation for the extrapolated coefficients, we approximate the truncation 
error term in (17) as an integral over j and integrate analytically to compute 
the total error e'^^. In this work, the maximum error over all Np lines for each 
/i' within Efc is computed and taken as the E^-local error estimate. The local 
convergence rate is simply the minimum value of |A'^| over all lines and all /i'. 
For all our test cases, M — 5. 

Thus, Efe is refined, if for some £est is above a threshold value or if jA'^l is 
below another threshold. For coarsening, for all n, all 2'^ sibling elements must 
have their e'^^s below some value proportional to the refinement threshold. 
This requirement prevents "blinking," where refined elements are immediately 
coarsened because the error tolerances are met on the refined elements. 

In conjunction with this spectral criterion, we can often obtain better overall 
accuracy-convergence results by checking whether the Ej^-maximum second 
derivative in any coordinate is above a certain threshold and by performing a 
logical OR of that condition with the spectral error threshold for refinement 
tagging. 

While the high degrees used in the expansions will help the spectral error 

estimator, other refinement criteria may be more efi'ective, given the variety of 
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solution structures arising in our applications. The investigation of refinement 
criteria appropriate for such intermittent features is a major outstanding prob- 
lem in numerical solution of PDEs, and one that we consider in a companion 
paper [14]. 



3 Test problems and results 

We have chosen a set of test problems that examine various aspects of (1). 
The primary objective is to investigate the temporal and spatial convergence 
properties of the solutions when adaptivity is used. For this purpose, we have 
selected test problems that have analytic solutions, so that the errors may be 
determined exactly, instead of only by comparison, for example, to a highly 
refined control solution. The test problems begin with the simplest aspect of 
(1) and continue to progressively more difficult problems until the behavior of 
the full 2D nonlinear version of (1) is considered. 

For each test problem, a BDF3/Ext3 scheme is used for the time deriva- 
tive and the advection term, respectively (Section 2.1.2). This requires that, 
at the initial time all the required time levels be initiahzed, m e 
{1, • • • max(Mbdf, Mext) — !}• Both the spectral and second- derivative error 
estimators are used for adaption criteria. The spectral error is normalized by 
the norm of the solution, ||tt°||p = VvP'^vP, at the start of the run, and the 
second x'^-derivative is normalized by ||it°||p/L^, where L is the longest global 
domain length. Elements are tagged for refinement based on a logical OR of 
the two criteria. The lA'^l threshold in all cases is In 10. 
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3.1 Heat equation 



For the linear case c = c{t) the analytic fundamental solution of (1) is a 
d-periodized Gaussian in D = [0, 1]'^: 



for t > -a(0)^/4z/ (^^(f, t) = otherwise), where a{t) = ^JcriQ^ + Aut, 
(j(0) = ^2/20 is the initial distribution e-folding width and f° = Ej=ir''/2 
is the initial peak location. To compute (18), we truncate summands of value 
be less than 10~^^ of the partial sum. The simplest version of (1) is the heat 
equation, where c = 0. The purpose here is to investigate temporal and spatial 
convergence of the adaptive solutions without advection. The initial condition 
at i = from (18) with o? = 2 is computed on K — Ax A elements, and the 
mesh is refined until the maximum allowed number of refinement levels may 
be reached. Both a spectral estimator and derivative threshold were used. The 
threshold and coarsening factor for each were set to be 10"^ (10~^) and 1 (0.5), 
respectively. 



3.1.1 Temporal convergence 

First, we consider time convergence of the adaptive solutions by integrating 
to a fixed time t = 0.05 for various fixed timesteps At. From (18) a curve of 
relative L2 error e — ||w — Wa||p/||Hallp '^^ generated for each of several 
maximum-refinement levels £max and for four degrees p. We present the results 
in Fig. 4, adopting the convention £- control for the grid that uniformly covers 
the domain with elements at the finest resolution in the ^max — ^ case. The 
BDF3/Ext3 is a globally third-order scheme, so we expect that if the solution 
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is well resolved spatially, we should see a slope of 3 in a log-log plot of 
error vs At. This is precisely what is seen in the figure; each plot consists 
of a sequence of four curves for the refinement levels ^max G {0, • • • 3}, where 
^max = implies that no refinement is done. For the spatially resolved curves 
in each plot, the error is linear with slope 3.14. 

We see in Fig. 4 that even at low p, the solution can be well resolved 
as long as refinement is used. We also see that as p increases, there is less 
need for refinement because the unrefined mesh is able to resolve the solution 
adequately, at least for the period of integration. 

To compare the various refinement levels, we have made the same runs using 
a 3-control grid. The ^max-control solutions are defined to be those generated 
on a nonadaptive grid comprising elements at the finest scale in the ^mal- 
adaptive case. Thus, in this problem, since we initialized with a iiT = 4 x 4 
grid, the 3-control grid consists of 2^4 x 2^4 or X = 32 x 32 elements. These 
plots are indicated by the dashed fines in Fig. 4, which all follow the ^max = 3 
curves. As the polynomial degree increases, the solution requires fewer levels 
of refinement to achieve the same accuracy as with the 3-control grid, at a 
considerable computational savings. 



3.1.2 Spatial convergence 

In this portion of the heat-equation test, we consider the effects of polyno- 
mial degree p on the solution. The maximum number of refinement levels is 
fixed to £max = 3. Here, a variable Courant-limited timestep 

At < K / max a H — b-m 

/ ie{i,-p},fce{i,-i^},/.e{i,-d} V(^^fc) / 
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Fig. 4. Plots of log^gd Ua| Ip/I \u^\ \p) for the heat equation vs fogio -^t, for different 
polynomial degrees p. Within each plot are four curves for the maximum refinement 
levels imax = (solid curve with circle markers), ^max = 1 (dashed curve with 
cross markers), ^max = 2 (dotted curve with diamond markers) , and -^max — 3 
(dash-dotted curve with square markers). The control solutions are indicated with 
dashed curves and follow closely the ^max = 3 curves. The axes in each plot have 
the same limits. Note that as p increases, the curves converge, for this range of At. 

is used with a fixed Courant number of k = 0.2, where hj j. = — ^j-i,k\ 
uses (A. 17). The solution is integrated to a time t = 0.5 chosen so that we 
observe the solution coarsening as it decays. The initial mesh is the same as in 
the time convergence test. The spatial convergence result is presented in Fig. 
5. 

We see in the figure the linear behavior characteristic of the spatial conver- 
gence in all our test problems. We expect that a solution converging spectrally 
as in (A. 7) will exhibit a straight line in a plot of log^odl** ~ ""-al |p/| l^^al |p) vs 
p, indicating that the error decays exponentially. Also plotted in this figure 



30 



6 

log P 



Fig. 5. Plots of logxo(||it — liallp/lli^allp)! ^ ^ function of p for the heat equation 
test. The square markers indicate the adaptive runs, while the diamonds represent 
the 3-control grid runs at the same polynomial degree. 
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Fig. 6. Plots of diagnostic quantities vs time in the p = 6 heat equation test. Top: 
Number of Elements vs time. Bottom: log;^o(ll''^ ~ ■"a||p/||''^allp) time. 
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are the 3-control runs, which saturate very quickly in this case but still pro- 
vide uniformly better errors than in the adaptive case except at the highest 
degrees. The cause is likely the elliptic nature of the problem in which the 
coarsening elements transmit their error throughout the grid. In Fig. 6 we 
can see that the error over time docs not decay monotonically except dur- 
ing periods where the grid is quasi-static. Keeping in mind that the decaying 
Gaussian contains all polynomial orders, one concludes that the solution error 
is globally influenced by the error from interpolations and from the inability 
of the truncated polynomial expansion at this degree to accurately model the 
decaying Gaussian. 

3.2 Linear advection equation 

In our next test we consider the linear advection equation (1) with d — 2 
and constant c=l^. This problem considers the ability of the code to simulate 
and follow a reasonably sharply localized translating distribution. The initial 
distribution is given by (18) at t — 0. For this test problem, we set u — 
10^^. The spectral error tolerance in this problem is turned off. The derivative 
criterion is set to 1.0 with a coarsening factor of 0.5. The [A'*! threshold is 
In 10. 

3.2.1 Temporal convergence 

The temporal convergence test is done in the same way as for the heat 
equation (Section 3.1.1). The total integration time ist — 0.06. We begin with 
a, K — 4: X 4: element mesh, each element of 2D degree p x p. In Fig. 7 we 
present our results. 
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Fig. 7. Plots of log;^Q(||ti — Wa| Ip/I Ip) for the advection-dominated problem vs 
log^o fo^ different p. Within each plot are subplots for each of four refinement lev- 
els, £max = (solid curve with circle markers), ^max = 1 (dashed curve with x mark- 
ers), ^max = 2 (dotted curve with diamond markers), and ^max = 3 (dash-dotted 
curve with square markers). The control solutions are indicated with dashed curves 
and generally follow the ^max = 3 curves. The axes in each plot have the same limits. 



For the spatially resolved curves in each plot, the slope of the curve is 2.95. 
Even at high degree p, the error is Z\t-independent for the unrefined mesh. For 
lower degrees, to obtain a case where the solution error decays at the order of 
the time-stepping method indeed requires a larger number of refinement levels, 
indicating that the solution is well resolved spatially only at these higher levels. 
Thus, in order to resolve this distribution properly so that the temporal error 
is 0{At^), refinement is necessary. 

Also plotted Fig. 7 are the 3-control runs corresponding to the adaptive 
solutions. These runs are indicated by the dashed lines (just visible in the 
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Fig. 8. Plots of log^gdlu — Ua||p/||i^allp) ^ ^ functioii of p for the linear advection 
test. The square markers indicate the adaptive runs, while the diamonds represent 
the 3-control grid runs at the same polynomial degree. These two curves are nearly 
identical. 

top-right plot), which all follow the £max = 3 curves. As the polynomial degree 
increases, the solution requires fewer levels of refinement to achieve the same 
accuracy as with the 3-control grid, again at a considerable computational 
savings. 

3.2.2 Spatial convergence 

We next consider the effects of expansion degree p on the solution error. The 
maximum number of refinement levels is fixed to ima.x = 3. Here, a Courant- 
limited timestep is again used with a Courant number of 0.2. The solution is 
integrated to a time {t = 0.2) chosen so that several cycles of coarsening and 
refinement occur (see top of Fig. 9). The initial mesh is the same as in the 
time convergence test. The spatial convergence result is presented in Fig. 7. 
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Fig. 9. Plots of diagnostic quantities vs time in the p = 8 linear advection test. Top: 
number of elements vs time. Bottom: logiQ{\\u — iia| |p/| |Wal Ip) vs time. The errors 
for the adaptive and control meshes lie on top of one another. 

The anticipated spectral decay of error can be seen in Fig. 8, which also 
includes the uniform control solutions. We see again that the adaptive solution 
error decays nearly identically to the control solution, suggesting again that 
interpolations and polynomial truncation error introduce no deleterious effects 
in the linear advection case. 

In Fig. 9, typical plots of two diagnostics are provided for the p = 8 run in 
Fig. 8. In the top of the figure is shown the number of elements as a function 
of time, and in the bottom, the error. Clearly, the adaption does not alter 
the monotonic behavior of the error. The error for the 3-control grid in the 
p = 8 case is also plotted in Fig. 9; the behavior of this error as a function of 
time is nearly identical to that for the adaptive mesh. Since the problem was 
initialized with a. K = 4x4 grid at i^am = 0, at £max = 3, the 3-control problem 
has i^T = 32 X 32 elements. The adaptive case clearly provides for a significant 
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savings in the number of d.o.f. required to compute the same solution. 

3.3 2D Burgers equation 

In this section we examine the full nonlinear c — u version of (1). The 
objective is to investigate the solution errors as the mesh resolves or tracks 
stationary and propagating fronts that are generated and sustained by the 
nonlinearity of the equation. The first problem is the familiar Burgers station- 
ary front, and the second, a radial N-wave. With the former, we investigate 
the effects of 2D adaptivity on a rotated ID nonlinear problem. In the N-wave 
problem we consider convergence properties of a fully 2D nonlinear case. 

3.3.1 Stationary Burgers front problem 

The stationary Burgers front problem is the classical solution to the non- 
linear advection-diffusion equation (1) with a planar front developing in the 
X direction. We compare the maximum derivative of the field, |5j;ti|niax, and 
the time at which the maximum occurs with the analytic solution. The clas- 
sical ID Burgers front problem for q{y,t) is cast into a 2D framework by the 
substitution u{x, t) — kq{k • x, kH) in (1) [13]. The initial conditions are 

q{y, 0) = - sin(7ry) -\- U2 sin(27ry). (19) 

For the first test we choose u — 10"^, k — V^, U2 — and use biperiodic 
boundary conditions for x e [0, 1]^. In each case the problem is initialized with 
K — A X 1 grid of a specified degree p. A BDF3/Ext3 scheme is used for the 
time derivative and advective terms, respectively. We initialize only at t — t'^ 

for this problem and integrate using a BDFl/Extl scheme to provide values at 
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Table 1 



Nonadaptive results from the stationary Burgers front problem. 

and t^. Two cases are considered: a nonadaptive case and an adaptive case 
with a maximum refinement level of ^max = 3. In the nonadaptive case, the 
x-'^-coordinates of the element vertices are at = 0, ±0.05, ±1, whereas in the 
adaptive case, the elements are initially uniform. The derivative error criterion 
is used in this problem, and the tolerance and coarsening multiplication are 
1.0 and 0.5, respectively. Table 1 presents the nonadaptive results, together 
with the results of the nonadaptive runs from [28]. The quantity |5j;M|inax is 
the maximum of the derivative, and Tmax is the time at which this occurs. We 
have verified that the L2 error of the solution is consistent with the error in the 
derivative. We note that the p — h case is considerably worse than the results 
presented in [28]. This may be due to differences between the bases used in 
the two methods [2]. We note that the errors in Tmax for our nonadaptive 
case are uniformly better than those in [28], while the errors in |9^w|max are 
comparable ioi p > 5. 

In Table 2 we present the results from the adaptive case and the so-called 
reference and control solutions. The reference solution runs on a uniform grid 
with the same number of elements as the adaptive solution at Tmax- Thus, 
it offers a solution computed with roughly as many d.o.f. as the adaptive 
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Table 2 



Adaptive, reference, and control results from the stationary Burgers front problem. 

solution, and hence requires about the same computational effort. Clearly, 
the ability to resolve the front is critical in this case; the reference solution for 
p — 5 actually diverges, and good solutions are not produced until p > 13. The 
control solutions, as expected, are all nearly identical to the adaptive solutions. 
This fact suggests that our refinement criteria enable the adaptive mesh to 
capture the formation of the front accurately, at a significantly reduced cost. 



3.3.2 N-wave problem 

The 2D Cole-Hopf transformation 

^=-2i/Vlnx. (20) 

transforms (1) into a heat equation for x- Choosing a source solution [39] 

s ^ a (x — x'^)'^ 
X(x,t) = l + -exp 

we obtain the solution to (1) immediately from (20): 

u(x,t) — - — -—^ — ^„No / . — (21) 

^ ^ t a + texp((f -fO)V4i/t) ^ ^ 

This is a radial N-wave, where x° = {V^ + i^)/2 is the location of the center. 
This solution is singular as t — > 0, so we initialize at a finite time = 5 x 10~^. 
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For this test problem, we choose u — 5x 10~^ and a — 10^. Dirichlet boundary 
conditions (2) on D = [0, 1]^ are imposed at each timestep by using (21) 
on dB>. The initial grid consists oi K = 4 x 4 elements, and we consider 
only the full adaptive case with ^max = 4. This i^ax provides a control mesh 
that is expensive to compute on; hence, we examine the temporal and spatial 
convergence of only the adaptive solutions. The refinement criteria are the 
same as in Section 3.2. 

Fig. 10 presents a time series of a typical N-wave solution illustrating the 
refinement patterns characteristic of all the runs. For simplicity only one quad- 
rant of the axially symmetric solution is presented. As the front propagates 
outward, the grid refines to track it, while in the center where the velocity com- 
ponents are planar, the grid coarsens. Note that the front does not steepen in 
this problem, as it does in the planar front problem (Section 3.3); it simply 
decays as it moves outward. 

In considering the time convergence, we set p — 14 and vary At to produce 
the error plot in Fig. 11. As was the case previously, each point in this plot 
reflects a single run integrated at the indicated fixed At and the error at 
t — 0.11. This integration time interval was chosen to provide a number of 
refinement and coarsening events. Nevertheless, the solution converges with 
At, at order (slope) 3.01. 

In order to check spatial convergence, the solution is integrated to i = 0.11 
by using a variable At and p and fixed Courant number of 0.15. In Fig. 12 
we present the final L2 error vs p. As with the linear advection case, the error 
behaves spectrally for a finite time integration. 
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Fig. 10. Surface plots of u^{x,t) solved from (1), showing x G 1]^ and K/4 = 88, 
121, 139, 172, 181, 190 as t = 0.18, 0.33, 0.48, 0.65, 0.81, 1.00. Here c = u, 
u = 5 X 10^'^, p = 8, and (21) is the initial condition. 
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Fig. 12. Plot of log^odl'" ~ ■'^allp/llt^allp), as a function of p for the N-wave test. 
4 Discussion and conclusion 



We have presented an overview of the GASpARcode, concentrating on the 
continuous Galerkin discretization of a single (generalized advection-diffusion) 
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equation to illustrate the construction of the weak and collocation operators 
and to highlight aspects of the code design. We have provided a detailed de- 
scription of the underlying mathematics and constructs for connectivity and 
a new adaptive mesh refinement algorithm that are used in the code, and we 
have shown how they maintain continuity between conforming and noncon- 
forming elements. In the process, we have established a consistent set of rules 
for refinement and coarsening. In addition, we have described the refinement 
criteria and using them, have demonstrated with several test problems the 
abihty of DARe to model accurately a variety of 2D structures that evolve as 
a result of linear and nonlinear advective and dissipative dynamics. 

The test problems suggest that for resolving isolated structures, dynamic 
adaptive refinement can offer a substantial computational savings over a either 
a pseudo-spectral or conforming spectral-element method for the same control 
resolution. But for turbulent flows, how likely is it that only a few isolated 
structures will exist? And how dependent is the time evolution of the flow 
on these structures, such that, if they are resolved, the statistical properties 
of the overall flow will be preserved using DARe? One aspect of GASpAR 
that can help answer these questions is that the fields solved for need not be 
those on which adaption criteria directly operate. The user is free to specify 
any functionals of the fundamental fields (velocity, pressure, etc.) for use in 
tagging elements for refinement or coarsening. For example, while the velocity 
is actually solved for in (1), the adaption criteria might operate on kinetic 
energy, vorticity, or enstrophy. Arguably, some fully developed turbulent fiows 
viewed in terms of the fundamental fields may be too intricate to benefit 
from DARe. Nevertheless, when viewed w.r.t. some functional, some relevant 
structures, when resolved, may allow for accurate simulation of the significant 
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dynamics of the overall flow. 



A Spectral-element formalism 

A.l Canonical ID element 

In this appendix we summarize our notation and results from the SEM 
literature. Any dependent variable u — u{^) in the domain ^ e 1] ma-Y be 
approximated by its projection VpU on the space Vp of polynomials of degree 
p, using values on any Np = p + 1 distinct nodal points ^ji 

p 

u — VpU + £pU VpU = ^ u{^j)(f)j, (A.l) 

j=0 

where SpU is the pointwise error and 

UO^Ur^TT^^j,^'^ (A-2) 

denotes the Np unique Lagrange interpolating polynomials of degree p. Now 
let the be the Gauss-Lobatto-Legendre quadrature nodes 

= {j + l)th least root of (1 - C^)-^, (A.3) 

where hj is the standard Legendre polynomial of degree j and norm (j ' + 
In this case 

MO = wj> L.(e/)L,(0/ E ^fhiO")', (A.4) 

j=0 j"=0 

and the integral 

(u), = [' uiOd^ = f2^ju{^j)+T^P<0: (A.5) 

•^-1 j=0 

where 

w, = 2/pNpLp{Q^ (A.6) 
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denotes the Gauss-Lobatto-Legendre weights, Up = _22p+i g^(p^^);' ( d/ d^^^ 
is the residual operator [37] and ^' e ] — 1, 1[. Then the mean-square error is 
bounded as 

((£,«)2)iocp^-^«f:i|«(«)f (A.7) 

q=0 

for any order Q of square-integrable derivative. Thus if u is infinitely smooth 
then VpU converges to u spectrally. 



A. 2 General ID spectral elements 

Now let [—1, 1] be subdivided and covered by disjoint ID elements 
]xk-i,Xk[ = = i^ikC-l, 1[), where 

Mi) = ^fe-i + Hi^ + 0, ke{l,---K'}, (A.8) 

is a coordinate map with inverse = 2[x — Xk-i)/hl. — l. (Other invertible 

maps may sometimes be preferable.) For xq = —1, xk^ = 1 and positive 
h\ = Xk — Xk-i one has Elf^El, — ii k k' and 

[-1,1] =UE^ (A.9) 

k=l 

Then u may be approximated by its projections Vk,pU on the space V^i ^ of 
piecewise polynomials of degree p on the E^, by using ti- values on the K^Np 
mapped nodal points 

X3,k = Mij) (A.io) 

generalized from (A. 3). That is, (A.l) generalizes to 

p 

'^=Y1 i'Pk,pU + Sk,pU) , 'Pk,pU = J2 u{Xj,k)(l)j,k: (A.ll) 

k=l j=0 
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where £k,pU = £p{u o -dk) o -dj^^, and (A. 2) generalizes to 

1) -^jjk — ■^j'jk'-i 



(A.12) 
0, otherwise. 



We have adopted the additional notations f o g{x) = f{g{x)) and l§(a;) = 1 
(x e S), (otherwise). Then (A. 5) generalizes to 

= m / u{x)dx, / u{x) dx = J2'^j,ku{xj,k) +T^k,pu{x'i.), {A.13) 

where (A. 6) generalizes to 

Wj,k = \-^^kiQ\wj, (A.14) 
nk,pu = {hl/2fP+^np{u o i9k) o and 4 G E^. 

A. 3 General d- dimensional spectral elements 

Generalizing (A. 9), assume the d-dimensional problem domain D can be 
partitioned into K disjoint images of [— 1, l]*^ = |C* I ^ 1] } 

K _ 

B=\jEk, (A.15) 

k=l 

where = 1, if) has diameter 

/t^ = max max \x^ — x'^\ (A. 16) 

and 'dkii) has inverse 'i?fe^(a;) but is not necessarily linear. Now generahzing 
(A. 11), one may approximate a field u{x) by its projections Vk,pU on the space 
'Vh,p of piecewise polynomials of degree in coordinate x^ on the E^, using 

M- values on the Nk,^ = K 0^=1 A'pM mapped nodes 

xj,k = M^j), (A. 17) 
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generalized from (A. 10), where = are d-dimensional GLL nodes. That 
is, (A. 11) generahzes to 

K 

u pa T'h,pU = 'Pk,pU, Vk,pU = J2 ui^j,k)<i>j,k, (A. 18) 

k=i jej 

where J = {J | j'* € {0, • • •p'*}}, (A. 12) generahzes to 



-L) -^^k — ,k' 

0, otherwise. 



(A.19) 



and 0j(C) = n)l=i 4'3t^{i^)- The appropriate approximation of a vector 

d 

uses cf) with entries 

r^.k = '/'j;^^^ (A.20) 
and u with entries li^i^ = u^{xf^k)- For scalars u (A. 13) generahzes to 



K 



{u)= j ■■■ j u{x)d'^x = Y, j ■■■ j u{x) d' 

K 



(A.21) 



where (A. 14) generahzes to 

d 

w^k = I det (g-) I n • (A-22) 
Finahy, variational formulation depends on the inner product from (A.21): 

K 

{U, V) = (UV) ^Y.Y1 '^3,kU{^T,k)v{Xj,k) = {u, v)^^ (A.23) 

fc=l JGJ 



for scalars, {u, v) = (t*'^, ■w'') for vectors, ^li, iT^ = S^,ju'=i {u'^''^\v'^''^'^ for 
tensors, and so forth. 
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